MDM File Extract Customer

Prerequisites

The following prerequisites must be met before running MDM File Extract Customer:

  • Task Group Execution: The Reltio_MDM_Extract task group must have been executed successfully at least once.

  • Notification Channels: At least one notification channel must be configured among Email, IDMS, Teams, and Slack. For more information, see Adding Alert Notification.

MDM File Extract Customer Overview

The Reltio MDM Customer Data Extract process downloads data from Reltio using the s3Connector and sqlExecutor plugins. The process works as follows:

  1. First, data from Reltio MDM is extracted to the Landing Tables on the IDP Platform and then to the Staging Tables in JSON format.

  2. Views are created from the data in the Staging Tables.

  3. The data in the Views is exported to flat files, which are stored in the S3 bucket.

  4. Data files can be extracted in two modes, Full and Incremental, selected with the parameter ExportDataRange.

  5. The Full extract mode extracts all the data from Reltio.

  6. In Incremental extract mode, the file extract process uses the Last Run Date in the extract normal views to identify the entities to be extracted to flat files, and records these entities in the Publish Control Table (ODP_CORE_LOG.MDM_EXTRACT_PUB_CNTL) with the status PENDING_EXTRACT.

  7. For each customer entity in the extract Publish Control Table, all information (attributes, nested attributes, relations, and merges) is extracted. After the file extraction completes, the status is set to SUCCESSFUL.

  8. Files are extracted to the S3 location <defaultBucket>/customer_extract/<Reltio_Connection_Name>/<DateTimestamp>/.

Follow the steps below to execute the Customer Data Extract:

  1. Import the task group template.

  2. Configure the task group.

  3. Extract data from Reltio.

Import the Task Group Template

To extract the Customer Data from Reltio, first create a Task Group in the IDP OA platform to execute the extract process. Follow the steps below to create a task group in IDP OA.

  1. Using WinSCP or an S3 browser, connect to the IDP default S3 bucket and go to the folder <bucket_name>/templates/product.

  2. Find the template MDM_File_Extract_Customer_<version>.json and download the latest version to your local machine.

  3. Open the pipeline template MDM_File_Extract_Customer_<version>.json in any text editor, replace all occurrences of the placeholders below, and save the file.

    Placeholder: <RELTIO_CONN_NAME>
    Replaceable String: Reltio connection name configured in Entity Collection. For example, RELTIO_MDM_CM
    Description: Reltio connection name used to extract Reltio customer data to staging and outbound views.
    Default Value: No default value. Should be provided with a valid value.

    Placeholder: <DATABASE_NAME>
    Replaceable String: Database name. For example, IDP_IQVIADEV_OAIDP_DEV1_USV_ENV_DWH
    Default Value: No default value. Should be provided with a valid database name.

    Placeholder: <IDP_SCHEMA_NAME>
    Replaceable String: Extract outbound views schema name. For example, MDM_OUT_STAGING
    Default Value: No default value. Should be provided with a valid schema name.

  4. Log in to the IDP OA platform and, under the Data Management section, click Data Pipeline.

  5. On the Landing Page, click the Data Pipeline tile to open the Task Group Pipeline Flow.

  6. Click Task Group from Template, select the latest downloaded template MDM_File_Extract_Customer, and then click OPEN.

  7. The pipeline task group for MDM_File_Extract_Customer is created. This task group is used to execute the data extract process.
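Step 3 above (replacing the placeholders in the template) can also be scripted instead of done by hand in a text editor. The following is a minimal sketch, not part of the product: the function names are hypothetical, the replacement values are the examples from the placeholder list above, and it assumes the template is a valid JSON document once the placeholders are filled in.

```python
import json
from pathlib import Path

# Example values taken from the placeholder list above; replace with your own.
REPLACEMENTS = {
    "<RELTIO_CONN_NAME>": "RELTIO_MDM_CM",
    "<DATABASE_NAME>": "IDP_IQVIADEV_OAIDP_DEV1_USV_ENV_DWH",
    "<IDP_SCHEMA_NAME>": "MDM_OUT_STAGING",
}


def fill_placeholders(text: str, replacements: dict) -> str:
    """Replace every occurrence of each placeholder in the template text."""
    for placeholder, value in replacements.items():
        if not value:
            # None of the placeholders has a default value, so refuse empty ones.
            raise ValueError(f"No value provided for {placeholder}")
        text = text.replace(placeholder, value)
    return text


def fill_template_file(src: Path, dst: Path, replacements: dict) -> None:
    """Read the downloaded template, fill it in, and write the result to dst."""
    filled = fill_placeholders(src.read_text(encoding="utf-8"), replacements)
    json.loads(filled)  # assumption: the filled template must still parse as JSON
    dst.write_text(filled, encoding="utf-8")
```

To use it, call fill_template_file with the path of the downloaded template and an output path, then import the output file in step 6.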

Configure the Task Group

Verify the task group and task parameters below, and configure them as required.

Parameter Name: IS_FULL_EXTRACT
Parameter Level: Task Group
Default Value: false
Description: To extract full customer data to files, change the value of this parameter to true. Keep the value as false to extract incremental (delta) data.
Example:
    For the value true, the date range used to extract the data to files is always from 1900-01-02 00:00:00 to the current timestamp.
    For the value false, the date range used to extract the data to files is from the previous extract date to the current timestamp.
        File Extract Run1: 1900-01-02 00:00:00 to 2021-08-17 03:52:17
        File Extract Run2: 2021-08-17 03:52:18 to 2021-08-18 09:33:46
        File Extract Run3: 2021-08-18 09:33:47 to 2021-08-19 11:01:21

Parameter Name: EXTRACT_FILE_FORMAT
Parameter Level: Task (File_Extract)
Default Value: CSV
Description: Extract file format. Can be any one of the values below.
    CSV = pipe delimited
    CSV_COMPRESSED = pipe delimited with gz compression
    PARQUET = parquet
    PARQUET_COMPRESSED = parquet with snappy compression

Parameter Name: EXTRACT_FILE_NAME_PREFIX
Parameter Level: Task (File_Extract)
Default Value: MDM_CM
Description: The value of this parameter is used as the name prefix for all extracted files. Change this value if a different prefix is required.
Example:
    MDM_CM_HCO_20210816092140.csv
    MDM_CM_HCP_20210816092140.csv
    MDM_CM_COMM_20210816092140.csv
    MDM_CM_AFFILIATION_20210816092140.csv

Parameter Name: EXTRACT_FILE_NAME_WITH_DATETIMESTAMP
Parameter Level: Task (File_Extract)
Default Value: false
Description: If true, the extract file name is suffixed with the extract date timestamp.
    Example: MDM_CM_HCO_20210816092140.csv
If false, the extract file name does not have the extract date timestamp as a suffix.
    Example: MDM_CM_HCO.csv

Parameter Name: EXTRACT_S3_BASE_FOLDER
Parameter Level: Task (File_Extract)
Default Value: customer_extract
Description: The customer extract base folder where extract files are stored.
    Example: customer_extract/reltio_mdm_cm/20210816092140/
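The date ranges in the IS_FULL_EXTRACT example follow a simple rule: a full extract always starts at 1900-01-02 00:00:00, and an incremental extract starts one second after the previous extract ended. The sketch below reproduces that rule for illustration only. The function name extract_window is hypothetical, and the platform's actual implementation is not shown in this document.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Start of the range used for a full extract, as stated in the parameter description.
FULL_EXTRACT_START = datetime(1900, 1, 2, 0, 0, 0)


def extract_window(is_full_extract: bool,
                   previous_end: Optional[datetime],
                   now: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) date range for one file extract run."""
    if is_full_extract or previous_end is None:
        # Full extract, or the very first incremental run (no previous extract date).
        return FULL_EXTRACT_START, now
    # Incremental run: continue one second after the previous run ended.
    return previous_end + timedelta(seconds=1), now


# Reproduce Run1 and Run2 from the example above with IS_FULL_EXTRACT = false.
run1 = extract_window(False, None, datetime(2021, 8, 17, 3, 52, 17))
run2 = extract_window(False, run1[1], datetime(2021, 8, 18, 9, 33, 46))
print(run1[0], "to", run1[1])
print(run2[0], "to", run2[1])
```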

Extract Data from Reltio

After the successful import, navigate to the task group MDM_File_Extract_Customer.

Click the MDM_File_Extract_Customer task group to open it. See figure below.

Navigate to the Tasks tab, which lists the tasks associated with Reltio File Extract Customer (see figure below). The task group MDM_File_Extract_Customer includes the following tasks:

  1. File Extract Views Creation

  2. File Extract to S3

  3. Update_Extract_Publish_Control

The task File Extract Views Creation performs the following steps:

  1. This task creates the extract views based on the query configuration set in the pipeline.

  2. These views are used to create the export plugin that executes the extract process.

Task group

There are four tasks in total in this task group (MDM_File_Extract_Customer).

The task Stage_Subscription_Data performs the following steps:

  1. This task extracts specific entities alongside the regular data extract process.

  2. To extract the entities, create a folder named cm_subscription_extract/input in the S3 bucket under the MDM folder.

The task File Extract to S3 performs the following step:

  1. This task uses the export plugin to pick up the data available in the views and exports it as flat files to the S3 location.

The task Update_Extract_Publish_Control performs the following step:

  1. After the above tasks execute successfully, this task updates the status from PENDING_EXTRACT to SUCCESSFUL in the MDM_EXTRACT_PUB_CNTL table for all the CUSTOMER entities.
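The status handling described above (entities recorded as PENDING_EXTRACT, then set to SUCCESSFUL once the files are written) can be illustrated with a small, self-contained simulation. This is a sketch under stated assumptions: it uses an in-memory SQLite table in place of the real ODP_CORE_LOG.MDM_EXTRACT_PUB_CNTL table, and the column names ENTITY_ID, ENTITY_TYPE, and STATUS are hypothetical because the actual table layout is not given in this document.

```python
import sqlite3


def create_publish_control(conn: sqlite3.Connection) -> None:
    """Create a stand-in for the publish control table (hypothetical columns)."""
    conn.execute(
        "CREATE TABLE MDM_EXTRACT_PUB_CNTL ("
        "ENTITY_ID TEXT, ENTITY_TYPE TEXT, STATUS TEXT)"
    )


def mark_pending(conn: sqlite3.Connection, entity_ids: list) -> None:
    """Record the entities selected for extract with status PENDING_EXTRACT."""
    conn.executemany(
        "INSERT INTO MDM_EXTRACT_PUB_CNTL (ENTITY_ID, ENTITY_TYPE, STATUS) "
        "VALUES (?, 'CUSTOMER', 'PENDING_EXTRACT')",
        [(entity_id,) for entity_id in entity_ids],
    )


def mark_successful(conn: sqlite3.Connection) -> int:
    """Set PENDING_EXTRACT rows of CUSTOMER entities to SUCCESSFUL; return row count."""
    cursor = conn.execute(
        "UPDATE MDM_EXTRACT_PUB_CNTL SET STATUS = 'SUCCESSFUL' "
        "WHERE ENTITY_TYPE = 'CUSTOMER' AND STATUS = 'PENDING_EXTRACT'"
    )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
create_publish_control(conn)
mark_pending(conn, ["entity-1", "entity-2"])  # illustrative entity identifiers
updated = mark_successful(conn)               # run only after the file export succeeds
```

Because the update touches only rows still in PENDING_EXTRACT, running it a second time changes nothing, which is a reasonable property for a task that may be restarted after a failure.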

Log in to the IDP OA platform, find the task group MDM_File_Extract_Customer, and click RUN. After the task group run completes successfully, the extract files are available at MDM_File_Extract_Customer/DataExtract/ in the configured S3 location.

Note:

The location of the extracted files depends on the task group parameters EXTRACT_S3_BASE_FOLDER and RELTIO_CONNECTION_NAME. The location is a combination of <defaultS3Bucket>/<EXTRACT_S3_BASE_FOLDER>/<Reltio_Connection_Name>/<CurrentDateTimestamp>/

For example, oaidp-dev-usv-iqviadev-odp/customer_extract/RELTIO_MDM_CM/20210817053631/
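To locate the output of a given run, the extract location and file names can be derived from the parameters described earlier. The sketch below is illustrative only: the function names are hypothetical, the timestamp format is inferred from the examples in this document, and the file extension is shown only for the CSV format because the extensions used for the other formats are not stated here.

```python
from datetime import datetime

# Timestamp layout inferred from the examples, such as 20210817053631.
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def extract_location(default_bucket: str, base_folder: str,
                     connection_name: str, run_time: datetime) -> str:
    """Build <defaultS3Bucket>/<EXTRACT_S3_BASE_FOLDER>/<Reltio_Connection_Name>/<CurrentDateTimestamp>/."""
    stamp = run_time.strftime(TIMESTAMP_FORMAT)
    return f"{default_bucket}/{base_folder}/{connection_name}/{stamp}/"


def extract_file_name(prefix: str, entity: str, run_time: datetime,
                      with_timestamp: bool, extension: str = "csv") -> str:
    """Build a file name from EXTRACT_FILE_NAME_PREFIX and the entity name."""
    parts = [prefix, entity]
    if with_timestamp:
        # Mirrors EXTRACT_FILE_NAME_WITH_DATETIMESTAMP = true.
        parts.append(run_time.strftime(TIMESTAMP_FORMAT))
    return "_".join(parts) + "." + extension


location = extract_location("oaidp-dev-usv-iqviadev-odp", "customer_extract",
                            "RELTIO_MDM_CM", datetime(2021, 8, 17, 5, 36, 31))
file_name = extract_file_name("MDM_CM", "HCO",
                              datetime(2021, 8, 16, 9, 21, 40), with_timestamp=True)
print(location + file_name)
```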

Troubleshooting

  • This task group (MDM_File_Extract_Customer) contains three tasks.

  • If any task fails, fix the error and restart the task group from the failed task through to the end. If you cannot identify or fix the error, contact the MDM support team.